Abstract

In this note, we explore a connection between the small-set expansion problem and
a popular community finding approach for social networks, and observe that a
sub-exponential time small-set expansion algorithm can be used to provide a
sub-exponential time 2-approximation for hard instances of the community finding problem.

1 Introduction and Definitions

All graphs considered in this note are undirected and unweighted. Let $G = (V, E)$ denote the
given input graph with $n = |V|$ nodes and $m = |E|$ edges, let $d_v$ denote the degree of a
node $v \in V$, and let $A(G) = [a_{u,v}(G)]$ denote the adjacency matrix of $G$, i.e.,
$a_{u,v}(G) = 1$ if $\{u,v\} \in E$ and $a_{u,v}(G) = 0$ otherwise. Since our result spans two
research areas, we summarize the relevant definitions from both research fields [1, 5] for the
benefit of the reader.

• A set of communities $\mathcal{S}$ is defined to be a partition of $V$.

• If $G$ is $d$-regular for some given $d$, then its symmetric stochastic walk matrix is denoted
  by $\hat{A}(G)$, i.e., $\hat{A}(G) = [a_{u,v}(G)/d]$.

• For a $\tau \in [0,1)$, the $\tau$-threshold rank of $G$, denoted by $\mathrm{rank}_\tau(G)$, is
  the number of eigenvalues $\lambda$ of $\hat{A}(G)$ satisfying $|\lambda| > \tau$.

• For a subset $\emptyset \subset S \subset V$ of nodes, the following quantities are defined:

  - The (normalized) measure of $S$ is $\mu(S) = |S|/n$.

  - The (normalized) expansion of $S$ is
    $\Phi(S) = \frac{|\{ \{u,v\} \mid u \in S,\ v \notin S,\ \{u,v\} \in E \}|}{d\,|S|}$.

  - The (normalized) density of $S$ is $D(S) = 1 - \Phi(S)$.

  - The modularity of $S$ is

    $$M(S) = \frac{1}{2m} \sum_{u,v \in S} \left( a_{u,v}(G) - \frac{d_u d_v}{2m} \right) \qquad (1)$$

• The modularity of a set of communities $\mathcal{S}$ is

    $$M(\mathcal{S}) = \sum_{S \in \mathcal{S}} M(S) \qquad (2)$$

• For a function $f(n)$, $\exp(f(n))$ denotes $2^{c f(n)}$ for some fixed constant $c > 0$.
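
As a concrete illustration of the quantities defined above, the following Python sketch evaluates measure, expansion, density and modularity on a small regular graph. The edge-list representation, the function names and the 6-node prism example are assumptions made for illustration only. The expansion is normalized by $d\,|S|$, so that $D(S) = 1 - \Phi(S)$ is the fraction of the edge endpoints of $S$ that stay inside $S$.

```python
from itertools import product


def measure(S, n):
    """Normalized measure mu(S) = |S| / n."""
    return len(S) / n


def expansion(S, edges, d):
    """Normalized expansion: number of edges leaving S divided by d * |S|."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (d * len(S))


def density(S, edges, d):
    """Normalized density D(S) = 1 - Phi(S)."""
    return 1.0 - expansion(S, edges, d)


def modularity(S, edges, degree):
    """M(S) = (1/2m) * sum over ordered pairs (u, v) in S x S of (a_uv - d_u d_v / 2m)."""
    m = len(edges)
    adjacent = {frozenset(e) for e in edges}
    total = 0.0
    for u, v in product(S, repeat=2):
        a_uv = 1.0 if frozenset((u, v)) in adjacent else 0.0
        total += a_uv - degree[u] * degree[v] / (2.0 * m)
    return total / (2.0 * m)


def partition_modularity(parts, edges, degree):
    """Modularity of a set of communities: the sum of the per-community values."""
    return sum(modularity(S, edges, degree) for S in parts)


# Assumed example: two triangles joined by a perfect matching (3-regular, n = 6, m = 9).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)]
n, d = 6, 3
degree = {v: d for v in range(n)}
S, S_bar = [0, 1, 2], [3, 4, 5]

mu = measure(S, n)
D = density(S, edges, d)
M = partition_modularity([S, S_bar], edges, degree)
print(mu, D, M)
```

On this example the split into the two triangles has $\mu(S) = 1/2$, $D(S) = 2/3$ and modularity $1/6$, which agrees with $2 \times (\mu D - \mu^2)$.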



2 Small Set Expansion 

The following results are from [1], restated in our terminology after instantiation of parameters
with specific values and trivial algebraic simplification.

Theorem 2.1. [1]¹ Assuming $\mathrm{rank}_{1-10^{-5}}(G) \le n^{[\ldots]}$, there is an
$(\exp(n^{[\ldots]})\,\mathrm{poly}(n))$-time algorithm that outputs a subset
$\emptyset \subset S \subset V$ such that $0.92\,|S^*| \le |S| \le 1.08\,|S^*|$ and
$\Phi(S) \le \Phi(S^*) + 0.08$.

Theorem 2.2. [1]² Let $H$ be a regular graph with $r$ vertices. Assuming
$\mathrm{rank}_{1-10^{-5}}(H) \ge r^{[\ldots]}$, one can find in $\mathrm{poly}(r)$ time a subset
$S$ of vertices of $H$ such that $|S| \le r^{[\ldots]}$ and $\Phi(S) \le 10^{-[\ldots]}$.
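
Both theorems are conditioned on the threshold rank of the input graph. The sketch below computes $\mathrm{rank}_\tau(G)$ directly from its definition for a small $d$-regular graph given as a 0/1 adjacency matrix. It is illustrative only: the function names are hypothetical, the eigenvalues are obtained with a plain cyclic Jacobi iteration to keep the example dependency-free, and nothing here is part of the algorithms of [1].

```python
import math


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        # Stop once the squared off-diagonal mass is negligible.
        off = sum(a[i][j] * a[i][j] for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Small-angle rotation (c, s) that zeroes the (p, q) entry.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def threshold_rank(adjacency, d, tau):
    """Number of eigenvalues of the walk matrix adjacency / d with absolute value > tau."""
    walk = [[x / d for x in row] for row in adjacency]
    return sum(1 for lam in symmetric_eigenvalues(walk) if abs(lam) > tau)


# Assumed examples: the 4-cycle (2-regular) and the complete graph K4 (3-regular).
cycle4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(threshold_rank(cycle4, 2, 0.5), threshold_rank(k4, 3, 0.5))
```

The walk matrix of the 4-cycle has eigenvalues $1, 0, 0, -1$ and that of $K_4$ has eigenvalues $1, -1/3, -1/3, -1/3$, so the two calls count 2 and 1 eigenvalues above the threshold $0.5$, respectively. For graphs of realistic size a library eigensolver would be the natural replacement for the Jacobi loop.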

3 Modularity Maximization 

The goal of the community finding problem is to find a partition $\mathcal{S}$ that maximizes
$M(\mathcal{S})$. Let $\mathrm{OPT} = \max_{\mathcal{S}} M(\mathcal{S})$ denote the optimal
modularity value, and let $\mathrm{OPT}_2$ denote the optimal modularity value when one is
allowed at most 2 communities. It is easy to verify that $0 \le \mathrm{OPT} < 1$. For recent
algorithmic complexity results for modularity maximization, see [3].

4 The Remark 

It is known that, for $d$-regular graphs, modularity maximization is NP-hard for every constant
$d \ge 9$, and APX-hard for $d = n - 4$.

Remark 4.1.³ Let $G$ be a $d$-regular graph. Then, Theorems 2.1 and 2.2 imply that there exists
a constant $0 < \varepsilon < 1/2$ such that there is an algorithm $A_\varepsilon$ with the
following properties:

(a) $A_\varepsilon$ runs in sub-exponential time, i.e., in time $\exp(\delta n)$ for some
    constant $0 < \delta = \delta(\varepsilon) < 1$ that depends on $\varepsilon$ only.

(b) $A_\varepsilon$ distinguishes instances with $\mathrm{OPT} \ge 1 - \varepsilon$ from instances
    with $\mathrm{OPT} < \varepsilon$.

(Note that we make no claim if $\varepsilon \le \mathrm{OPT} < 1 - \varepsilon$.)

¹ Instantiate Theorem 2.2 in [1] with $\eta = 10^{-[\ldots]}$ and $\varepsilon = 10^{-[\ldots]}$.

² Instantiate Theorem 2.2 in [1] with $\eta = 10^{-[\ldots]}$ and $\gamma = 0.1$.

³ We have made no significant attempts to optimize the constants in Remark 4.1.
5 Proof of the Remark

Set $\varepsilon = 10^{-6}$. We assume that $G$ is $d$-regular, and either
$\mathrm{OPT} \ge 1 - 10^{-6}$ or $\mathrm{OPT} < 10^{-6}$.



Preliminary Algebraic Simplification 

Let $\mathcal{S} = \{S_1, S_2, \ldots, S_k\}$ be a set of communities. The objective function
$M(\mathcal{S})$ can be equivalently expressed as follows via simple algebraic manipulation
[2, 4-6]. Let $m_i$ denote the number of edges both of whose endpoints are in $S_i$, $m_{ij}$
denote the number of edges one of whose endpoints is in $S_i$ and the other in $S_j$, and
$D_i = \sum_{v \in S_i} d_v$ denote the sum of degrees of vertices in $S_i$. Then,

$$M(\mathcal{S}) = \sum_{i=1}^{k} \left( \frac{m_i}{m} - \left( \frac{D_i}{2m} \right)^2 \right)$$

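We will provide an approximation for $\mathrm{OPT}_2$ and then use the result that
$\mathrm{OPT}_2 \ge \mathrm{OPT}/2$ proved in [3]. Note that if $\mathrm{OPT} < 10^{-6}$ then
obviously $\mathrm{OPT}_2 < 10^{-6}$, whereas if $\mathrm{OPT} \ge 1 - 10^{-6}$ then
$\mathrm{OPT}_2 \ge 0.5 - 10^{-6}/2$.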

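Consider a partition $\mathcal{S}$ of $V$ into exactly two sets, say $S$ and
$\bar{S} = V \setminus S$ with $0 < \mu(S) \le 1/2$. By Lemma 2.2 of [3], $M(S) = M(\bar{S})$,
and thus

$$M(\mathcal{S}) = 2 \times \left( \frac{m_1}{m} - \left( \frac{D_1}{2m} \right)^2 \right) = 2 \times \left( \frac{m_1}{m} - \mu(S)^2 \right) = 2 \times \left( D(S)\,\mu(S) - \mu(S)^2 \right)$$

Thus, letting $D = D(S)$, $\mu = \mu(S)$ and $\Phi = \Phi(S) = 1 - D$, our goal is to maximize the
following function over all possible valid choices of $D$ and $\mu$:

$$f(\mu, D) = 2 \times (\mu D - \mu^2) = 2 \times \mu \times (D - \mu)$$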
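
For a $d$-regular graph, the identity $m_1/m = D(S)\,\mu(S)$ behind the function $f$ can be checked directly. The short derivation below is a sketch that assumes the expansion is normalized by $d\,|S|$, with $m_1$ the number of edges inside $S$ and $D_1$ the sum of degrees in $S$:

```latex
\begin{align*}
2m &= dn, \qquad D_1 = \sum_{v \in S} d_v = d\,|S|, \\
\bigl|\{\{u,v\} \in E : u \in S,\ v \notin S\}\bigr| &= d\,|S| - 2m_1, \\
D(S) = 1 - \Phi(S) &= 1 - \frac{d\,|S| - 2m_1}{d\,|S|} = \frac{2m_1}{d\,|S|}, \\
D(S)\,\mu(S) &= \frac{2m_1}{d\,|S|} \cdot \frac{|S|}{n} = \frac{2m_1}{dn} = \frac{m_1}{m}, \\
\left( \frac{D_1}{2m} \right)^2 &= \left( \frac{d\,|S|}{dn} \right)^2 = \mu(S)^2 .
\end{align*}
```

Substituting the last two lines gives $\frac{m_1}{m} - \left(\frac{D_1}{2m}\right)^2 = D(S)\,\mu(S) - \mu(S)^2$, and the factor 2 accounts for the two sides of the partition having equal modularity.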

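Let $\mathcal{S}^* = \{ S^*, \bar{S^*} \}$ be an optimal solution for the problem of partitioning
into 2 communities, with $D = D^*$, $\mu = \mu^*$ and $\Phi = \Phi^*$ (and thus
$\mathrm{OPT}_2 = f(\mu^*, D^*)$). Obviously,

$$f\left( \tfrac{1}{2} + \delta,\ D^* \right) = f\left( \tfrac{1}{2} - \delta,\ D^* \right) \quad \text{for any positive } \delta > 0$$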



Note that we need to show that, if $\mathrm{OPT}_2 = f(\mu^*, D^*) \ge 0.5 - 10^{-6}/2$, then
there is an algorithm $A$ as described in Remark 4.1 that outputs a valid choice of $\mu$ and $D$,
say $\mu'$ and $D'$, such that $f(\mu', D') > 10^{-6}$.

Guessing D* 

Note that there are at most $O(dn^2)$ choices for $D^*$ since $D^*$ is of the form $i/(jd)$ for
$j \in \{1, 2, \ldots, n/2\}$ and $i \in \{1, 2, \ldots, jd\}$. In the sequel, we will run our
algorithm for each choice of $D^*$ and take the best of these solutions. Thus, it will suffice to
prove our approximation bound assuming we have guessed $D^*$ exactly.
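
The guessing step can be sketched as a plain enumeration. The Python fragment below lists every value of the form $i/(jd)$ for the stated ranges of $i$ and $j$, using exact rational arithmetic. The function names are hypothetical and the upper limit for $j$ is taken as $\lfloor n/2 \rfloor$, which is an assumption about how the range is rounded.

```python
from fractions import Fraction


def count_pairs(n, d):
    """Number of (i, j) pairs with 1 <= j <= n // 2 and 1 <= i <= j * d."""
    return sum(j * d for j in range(1, n // 2 + 1))


def candidate_densities(n, d):
    """Sorted distinct values i / (j * d) over all admissible pairs (i, j)."""
    values = set()
    for j in range(1, n // 2 + 1):      # j plays the role of |S|, at most n / 2
        for i in range(1, j * d + 1):   # i ranges over 1 .. j * d
            values.add(Fraction(i, j * d))
    return sorted(values)


# Assumed toy parameters: n = 6 nodes, degree d = 3.
candidates = candidate_densities(6, 3)
print(count_pairs(6, 3), len(candidates))
```

The number of pairs is $d \cdot \frac{(n/2)(n/2+1)}{2} = O(dn^2)$, and many pairs collapse to the same rational value, so the list of distinct guesses is no longer than that.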

In the remainder of the proof, we will make use of Theorems 2.1 and 2.2. The description
is self-contained, and the reader will not need any prior knowledge of expansion properties of
graphs. Remember that we assume that $f(\mu^*, D^*) \ge 0.5 - 10^{-6}/2$ and thus
$\mu^* \ge 0.5 - [\ldots]$. Since $\mu^* \le 1/2$, this implies
$D^* = \frac{f(\mu^*, D^*)}{2\mu^*} + \mu^* \ge [\ldots] \ge 1 - 10^{-6}$ and thus
$\Phi^* \le 10^{-6}$.

Case I: Small Threshold Rank of $G$, i.e., $\mathrm{rank}_{1-10^{-6}}(G) \le n^{[\ldots]}$

We run the algorithm as outlined in Theorem 2.1, and return $\{S, \bar{S}\}$ as our solution.
Note that:

$\Phi(S) \le \Phi^* + 0.08 \le 0.080001 \implies D(S) \ge 1 - 0.080001 = 0.919999$
$0.92\,\mu^* \le \mu(S) \le 1.08\,\mu^* \implies 0.4599 \le \mu(S) \le 0.54$

and thus

$f(\mu(S), D(S)) = 2 \times \mu(S) \times (D(S) - \mu(S)) \ge 2 \times 0.4599 \times (0.919999 - 0.54) > 10^{-6}$
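
The final inequality of Case I is a worst-case bound: it plugs the smallest admissible measure into the first factor and the largest admissible measure into the second. The Python check below reproduces that arithmetic and spot-checks it on a grid. The helper names are hypothetical and the grid is only a sanity check, not a proof.

```python
def f(mu, D):
    """Objective for a two-way partition: f(mu, D) = 2 * mu * (D - mu)."""
    return 2.0 * mu * (D - mu)


def worst_case_lower_bound(mu_lo, mu_hi, D_lo):
    """Lower bound on f over mu in [mu_lo, mu_hi] and D >= D_lo (needs D_lo >= mu_hi)."""
    return 2.0 * mu_lo * (D_lo - mu_hi)


# Case I constants as stated in the text.
case_one = worst_case_lower_bound(0.4599, 0.54, 0.919999)
print(case_one, case_one > 1e-6)
```

The bound evaluates to roughly 0.35, far above the required $10^{-6}$, which reflects the remark that no attempt was made to optimize the constants.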

Case II: Remaining Case, i.e., $\mathrm{rank}_{1-10^{-6}}(G) > n^{[\ldots]}$

Our strategy is to use the algorithm in Theorem 2.2 to repeatedly extract high-rank parts
from the given graph until we cannot do so anymore⁴. Namely, we compute in polynomial
time an ordered partition of nodes $(T_1, \ldots, T_\ell, V \setminus \cup_{j=1}^{\ell} T_j)$
such that each $T_i$ is obtained by using the algorithm in Theorem 2.2 on the graph $G_i$ induced
by the set of nodes $V \setminus \cup_{j=1}^{i-1} T_j$, and the last (possibly empty) graph $G''$
induced by the set of nodes $V'' = V \setminus \cup_{j=1}^{\ell} T_j$ satisfies
$\mathrm{rank}_{1-10^{-6}}(G'') \le |V''|^{[\ldots]}$. Let $G'$ be the graph induced by the set of
nodes $V' = \cup_{j=1}^{\ell} T_j$. The following cases arise.
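
The extraction loop described above can be sketched as follows. Both callables are hypothetical placeholders: `has_small_threshold_rank` stands in for the rank test on the induced subgraph and `extract_small_set` stands in for the algorithm of Theorem 2.2. Neither is implemented here, and the regularization by self-loops mentioned in the footnote is omitted.

```python
def peel_high_rank_parts(vertices, has_small_threshold_rank, extract_small_set):
    """Repeatedly remove a set T_i from the remaining nodes while the rank test fails.

    Returns the ordered list (T_1, ..., T_l) and the leftover node set V''.
    """
    remaining = set(vertices)
    parts = []
    while remaining and not has_small_threshold_rank(remaining):
        part = set(extract_small_set(remaining)) & remaining
        if not part:
            break  # defensive guard: the placeholder oracle returned nothing usable
        parts.append(part)
        remaining -= part
    return parts, remaining


# Toy stand-ins, purely to exercise the control flow (not graph algorithms).
def toy_rank_test(nodes):
    return len(nodes) <= 4


def toy_extract(nodes):
    return sorted(nodes)[:2]


parts, leftover = peel_high_rank_parts(range(10), toy_rank_test, toy_extract)
print(parts, leftover)
```

With the toy stand-ins the loop removes three parts of two nodes each and stops with four nodes left, which mirrors the shape of the ordered partition $(T_1, \ldots, T_\ell, V'')$.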

Case II(a) $|S^* \cap V''| \ge |S^*|/2$.

Let $S_1$ be the set containing an arbitrary $|S^*|/2$ elements from the set $S^* \cap V''$.
Note that $\mu(S_1) = \mu^*/2$ and $\Phi(S_1) \le 2\Phi^*$. Using Theorem 2.1 on the graph $G''$
with $|S^*|$ replaced by $|S^*|/2$ outputs a set $S$ of nodes from $V''$ such that

$\Phi(S) \le 2\Phi^* + 0.08 \le 0.080002 \implies D(S) \ge 1 - 0.080002 = 0.919998$
$0.46\,\mu^* \le \mu(S) \le 0.54\,\mu^* \implies 0.229 \le \mu(S) \le 0.27$

and thus

$f(\mu(S), D(S)) = 2 \times \mu(S) \times (D(S) - \mu(S)) \ge 2 \times 0.229 \times (0.919998 - 0.27) > 10^{-6}$

Case II(b) $|S^* \cap V''| < |S^*|/2$.

Since $|T_i| \le |V \setminus \cup_{j=1}^{i-1} T_j|^{[\ldots]} \le n^{[\ldots]}$ for any $i$ and
$|S^*| \ge (0.5 - [\ldots])\,n$, there exists an index $i$ such that
$[\ldots] \le |\cup_{j=1}^{i} T_j| \le [\ldots]$. Notice that the set of nodes
$S = \cup_{j=1}^{i} T_j$ satisfies $\Phi(S) \le 0.01$ and, as a consequence of the previous
observation, $0.24 \le \mu(S) \le 0.51$. Thus,

$f(\mu(S), D(S)) = 2 \times \mu(S) \times (D(S) - \mu(S)) \ge 2 \times 0.24 \times (0.99 - 0.51) > 10^{-6}$
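
The choice of the index $i$ in Case II(b) rests on a simple prefix argument: when every extracted part is small, the first prefix union that reaches a target size cannot overshoot it by more than one part. The Python sketch below illustrates this with hypothetical names and toy sizes, and re-checks the closing arithmetic of the case. The specific size thresholds used in the proof are not reproduced here.

```python
def first_prefix_reaching(part_sizes, target):
    """Smallest i with |T_1| + ... + |T_i| >= target, with that prefix size, or None."""
    total = 0
    for i, size in enumerate(part_sizes, start=1):
        total += size
        if total >= target:
            return i, total
    return None


# Toy sizes: every part has at most 3 nodes, so the chosen prefix is below target + 3.
sizes = [3, 2, 3, 1, 3]
index, prefix_size = first_prefix_reaching(sizes, target=6)
print(index, prefix_size)

# Closing arithmetic of Case II(b) with the constants stated in the text.
case_two_b = 2.0 * 0.24 * (0.99 - 0.51)
print(case_two_b > 1e-6)
```

Here the third prefix is the first to reach the target of 6 nodes and has 8 nodes, within one part size of the target. The Case II(b) bound evaluates to about 0.23, again well above $10^{-6}$.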

⁴ [1] points out how to regularize the remaining graph each time a set of nodes has been
extracted by adding an appropriate number of self-loops of weight 1/2.
6 Further Research



An interesting open question is whether it is possible to prove the converse of Remark 4.1, i.e., 
can an appropriate sub-exponential approximation algorithm for modularity maximization be 
used to design a sub-exponential algorithm for small-set expansions? 